University of Pittsburgh at GeoCLEF 2008: Towards Effective Geographic Information Retrieval

Authors

  • Qiang Pu
  • Daqing He
  • Qi Li
Abstract

This paper reports the University of Pittsburgh’s participation in GeoCLEF 2008. As first-time participants, we worked only on the monolingual GeoCLEF task and submitted four runs under two different methods. Our GCEC method tests the effectiveness of our online geographic coordinate extraction and clustering algorithm, and our WIKIGEO method examines the usefulness of the geo-coordinate information in Wikipedia for identifying geo-locations. Our experimental results show that: 1) our online geographic coordinate extraction and clustering algorithm is useful for the type of locations that do not have clear corresponding coordinates; 2) expansion based on the geo-locations generated by GCEC is effective in improving geographic retrieval; 3) using Wikipedia, we can find the coordinates for many geo-locations, but its use for query expansion still needs further study; 4) query expansion based on the title only obtained better results than the combination of the title and narrative parts, which is thought to contain more related geographic information. Further study is needed on this point too.

Similar resources

GeoCLEF: the CLEF 2005 Cross-Language Geographic Information Retrieval Track

GeoCLEF is a new track for CLEF 2005. GeoCLEF was run as a pilot track to evaluate retrieval of multilingual documents with an emphasis on geographic search. Existing evaluation campaigns such as TREC and CLEF do not explicitly evaluate geographical relevance. The aim of GeoCLEF is to provide the necessary framework in which to evaluate GIR systems for search tasks involving both s...

University of Hagen at GeoCLEF 2008: Combining IR and QA for Geographic Information Retrieval

This paper describes the participation of GIRSA at GeoCLEF 2008, the geographic information retrieval task at CLEF. GIRSA is a modified and improved variant of the system which participated at GeoCLEF 2007. It combines results retrieved with methods from information retrieval (IR) on geographically annotated data and question answering (QA) employing query decomposition. For the monolingual Ger...

GeoCLEF 2008: the CLEF 2008 Cross-Language Geographic Information Retrieval Track Overview

GeoCLEF is an evaluation initiative for testing queries with a geographic specification in a large set of text documents. GeoCLEF ran a regular track for the third time within the Cross Language Evaluation Forum (CLEF) 2008. The purpose of GeoCLEF is to test and evaluate cross-language geographic information retrieval (GIR). GeoCLEF 2008 consisted of two subtasks. A search task ran for the third...

Re-Ranking for Geo-Relevance With Non-Contextual Heuristics at GeoCLEF 2007

Geographic Information Retrieval (GIR) is an attempt to improve relevance by taking geographic information in textual documents into account. We describe our experiments carried out at the GeoCLEF 2007 evaluation [1] that investigate further the role of geo-filtering based re-ranking and query expansion with geographic terms. Our main findings are that manual query expansion with geo-terms is m...

The University of Lisbon at GeoCLEF 2006

This paper details the participation of the XLDB group from the University of Lisbon at the GeoCLEF task of CLEF 2006. We tested text mining methods that make use of an ontology to extract geographic references from text, assigning documents to encompassing geographic scopes. These scopes are used in document retrieval through a ranking function that combines BM25 text weighting with a similari...

Publication date: 2008